Project Area

Public Transportation

Members

Jenny, Pippa, Rita

Research question

  • Where are there gaps in a public transportation system?
  • What are the connections between public transport systems and environmental protection and emissions?
  • How much money the government spends (or future spends) on transportation based on energy cost ?

Dataset

library(tidyverse)     # for graphing and data cleaning
library(lubridate)     # for date manipulation
library(ggthemes)      # for even more plotting themes
library(dplyr)
library(maps)          # for map data
library(ggmap)         # for mapping points on maps
library(RColorBrewer)  # for color palettes
library(sf)            # for working with spatial data
library(readr)
library(readxl)
library(imputeTS)
library(plotly)
Sys.setlocale("LC_TIME", "English")
## [1] "English_United States.1252"
theme_set(theme_minimal())

Data 1

calendar<-read.delim('calendar_dates.txt', sep=',')
stop_times<-read.delim('stop_times.txt', sep=',')
stops<-read.delim('stops.txt', sep=',')
trips<-read.delim('trips.txt', sep=',')
routes<-read.delim('routes.txt', sep=',')

Clean Data

newTrips <- trips %>%
  select(-trip_long_name:-bikes_allowed)
newStop <- stops %>% 
  select(-location_type:-zone_id)
newRoutes<-routes %>% 
  select(c(route_id,route_long_name))
newStopTimes<- stop_times %>% 
  select(-stop_headsign) %>% 
  select(-pickup_type:-fare_units_traveled) 
newCalendar <- calendar %>% 
  select(-exception_type)

Combine Data

walloonTransit<-newStopTimes %>% 
  left_join(newStop,by="stop_id") %>% 
  left_join(newTrips, by="trip_id") %>% 
  left_join(newCalendar, by="service_id") %>% 
  left_join(newRoutes, by="route_id") %>% 
  filter(date > 20180731& arrival_time<"06:00:00")

Data Source

TEC

Data Description

Transit data such as stop locations, scheduled times and dates, etc.

Data Limitation

The data was collected 4 years ago. Thus is a little outdated and the data does not contain the measure passenger flow. Also, since the dataset is too large, I decided to limit it to the early hours of each day in August.

Variables

head(walloonTransit)

stop_lon and stop_lat including the arrival_time of the trips are interesting. The stop_lon and stop_lat are basically the lontitude and latitude of each stop and arrival_time is the time when the train arrivals at the stop.

Visualization

mapStops <- get_stamenmap(
  bbox = c(left = 2, bottom = 49.4, right = 6, top = 51), 
  maptype = "toner",
  zoom = 12)

ggmap(mapStops) +
  geom_point(
    data = walloonTransit,
    aes(x = stop_lon, y = stop_lat),
    alpha = .3,
    size = .1,
    color = "maroon4"
  ) +
  labs(title = "Stops In Wallon Belgium")

map for the stops

Join Variables

All the variables including “id” would be possible to join with other dataset, such as trip_id or route_id.

Data 2

energyUse<- read_excel("Energy Use.xls", sheet="Data") %>% 
  na_replace(0) %>% pivot_longer(
                cols = c("1960":"2021"),
                names_to = "year",
                values_to = "country_energy_used"
)

Data Source

OECD

Data Description

Energy use (kg of oil equivalent per capita) by countries since 1960 through 2015.

Data Limitation

The data was collected 6 years ago. Thus is a little outdated and the data only contains one kind of energy that is used by countries. In addition, some of the data is missing.

Variables

head(energyUse)

Country Name, year and country_energy_used are the three interesting variables in this dataset. Country Name refers to the name of the country, year refers to year and country_energy_used refers to the use of energy (kg of oil equivalent per capita) by each country.

Visualizaiton

energyUseGraph<- energyUse %>%
  ggplot(aes(x = year, y = country_energy_used)) +
  labs(title = "Energy Comsuption By Countries", y = "Comsuption (kg of oil equivalent per capita)", x =
         NULL) +
  geom_point(aes(color = `Country Name`)) +
  theme(
    plot.title = element_text (hjust = 0.5,
                               face = "bold",
                               size = 11),
    axis.text.x = element_text(
      size = 6,
      angle = -90,
      hjust = 0
    ),
    panel.grid.major.x  = element_blank(),
    panel.grid.minor.x = element_blank(),
    legend.position = 'none'
  )

ggplotly(energyUseGraph, tooltip = c("Country Name","colour") )

Join Variables

Country Name, Country Code and year are possible to join with other dataset.

LS0tDQp0aXRsZTogJ0ZQMScNCmF1dGhvcjogIkplbm55IExpIg0Kb3V0cHV0OiANCiAgaHRtbF9kb2N1bWVudDoNCiAgICBrZWVwX21kOiBUUlVFDQogICAgdG9jOiBUUlVFDQogICAgdG9jX2Zsb2F0OiBUUlVFDQogICAgZGZfcHJpbnQ6IHBhZ2VkDQogICAgY29kZV9kb3dubG9hZDogdHJ1ZQ0KLS0tDQoNCmBgYHtyIHNldHVwLCBpbmNsdWRlPUZBTFNFfQ0Ka25pdHI6Om9wdHNfY2h1bmskc2V0KGVjaG8gPSBUUlVFLCBlcnJvcj1UUlVFLCBtZXNzYWdlPUZBTFNFLCB3YXJuaW5nPUZBTFNFKQ0KYGBgDQojIyBQcm9qZWN0IEFyZWENClB1YmxpYyBUcmFuc3BvcnRhdGlvbg0KDQojIyBNZW1iZXJzDQpKZW5ueSwgUGlwcGEsIFJpdGENCg0KDQojIyBSZXNlYXJjaCBxdWVzdGlvbg0KLSBXaGVyZSBhcmUgdGhlcmUgZ2FwcyBpbiBhIHB1YmxpYyB0cmFuc3BvcnRhdGlvbiBzeXN0ZW0/DQotIFdoYXQgYXJlIHRoZSBjb25uZWN0aW9ucyBiZXR3ZWVuIHB1YmxpYyB0cmFuc3BvcnQgc3lzdGVtcyBhbmQgICAgZW52aXJvbm1lbnRhbCBwcm90ZWN0aW9uIGFuZCBlbWlzc2lvbnM/IA0KLSBIb3cgbXVjaCBtb25leSB0aGUgZ292ZXJubWVudCBzcGVuZHMgKG9yIGZ1dHVyZSBzcGVuZHMpIG9uIHRyYW5zcG9ydGF0aW9uIGJhc2VkIG9uIGVuZXJneSBjb3N0ID8NCg0KIyMgRGF0YXNldA0KYGBge3IgbGlicmFyaWVzfQ0KbGlicmFyeSh0aWR5dmVyc2UpICAgICAjIGZvciBncmFwaGluZyBhbmQgZGF0YSBjbGVhbmluZw0KbGlicmFyeShsdWJyaWRhdGUpICAgICAjIGZvciBkYXRlIG1hbmlwdWxhdGlvbg0KbGlicmFyeShnZ3RoZW1lcykgICAgICAjIGZvciBldmVuIG1vcmUgcGxvdHRpbmcgdGhlbWVzDQpsaWJyYXJ5KGRwbHlyKQ0KbGlicmFyeShtYXBzKSAgICAgICAgICAjIGZvciBtYXAgZGF0YQ0KbGlicmFyeShnZ21hcCkgICAgICAgICAjIGZvciBtYXBwaW5nIHBvaW50cyBvbiBtYXBzDQpsaWJyYXJ5KFJDb2xvckJyZXdlcikgICMgZm9yIGNvbG9yIHBhbGV0dGVzDQpsaWJyYXJ5KHNmKSAgICAgICAgICAgICMgZm9yIHdvcmtpbmcgd2l0aCBzcGF0aWFsIGRhdGENCmxpYnJhcnkocmVhZHIpDQpsaWJyYXJ5KHJlYWR4bCkNCmxpYnJhcnkoaW1wdXRlVFMpDQpsaWJyYXJ5KHBsb3RseSkNClN5cy5zZXRsb2NhbGUoIkxDX1RJTUUiLCAiRW5nbGlzaCIpDQp0aGVtZV9zZXQodGhlbWVfbWluaW1hbCgpKQ0KYGBgDQoNCiMjIyBEYXRhIDENCmBgYHtyIGRhdGExfQ0KY2FsZW5kYXI8LXJlYWQuZGVsaW0oJ2NhbGVuZGFyX2RhdGVzLnR4dCcsIHNlcD0nLCcpDQpzdG9wX3RpbWVzPC1yZWFkLmRlbGltKCdzdG9wX3RpbWVzLnR4dCcsIHNlcD0nLCcpDQpzdG9wczwtcmVhZC5kZWxpbSgnc3RvcHMudHh0Jywgc2VwPScsJykNCnRyaXBzPC1yZWFkLmRlbGltKCd0cmlwcy50eHQnLCBzZXA9JywnKQ0Kcm91dGVzPC1yZWFkLmRlbGltKCdyb3V0ZXMudHh0Jywgc2VwPScsJykNCmBgYA0KDQojIyMjIENsZWFuIERhdGENCmBgYHtyfQ0KbmV3VHJpcHMgPC0gdHJpcHMgJT4lDQogIHNlbGVjdCgtdHJpcF9sb25nX25hbWU6LWJpa2VzX2FsbG93ZWQpDQpuZXdTdG9wIDwtIHN0b3BzICU+JSANCiAgc2VsZWN0KC1sb2NhdGlvbl90eXBlOi16b25lX2lkKQ0KbmV3Um91dGVzPC1yb3V0ZXMgJT4lIA0KICBzZWxlY3QoYyhyb3V0ZV9pZCxyb3V0ZV9sb25nX25hbWUpKQ0KbmV3U3RvcFRpbWVzPC0gc3RvcF90aW1lcyAlPiUgDQogIHNlbGVjdCgtc3RvcF9oZWFkc2lnbikgJT4lIA0KICBzZWxlY3QoLXBpY2t1cF90eXBlOi1mYXJlX3VuaXRzX3RyYXZlbGVkKSANCm5ld0NhbGVuZGFyIDwtIGNhbGVuZGFyICU+JSANCiAgc2VsZWN0KC1leGNlcHRpb25fdHlwZSkNCmBgYA0KDQojIyMjIENvbWJpbmUgRGF0YQ0KYGBge3J9DQp3YWxsb29uVHJhbnNpdDwtbmV3U3RvcFRpbWVzICU+JSANCiAgbGVmdF9qb2luKG5ld1N0b3AsYnk9InN0b3BfaWQiKSAlPiUgDQogIGxlZnRfam9pbihuZXdUcmlwcywgYnk9InRyaXBfaWQiKSAlPiUgDQogIGxlZnRfam9pbihuZXdDYWxlbmRhciwgYnk9InNlcnZpY2VfaWQiKSAlPiUgDQogIGxlZnRfam9pbihuZXdSb3V0ZXMsIGJ5PSJyb3V0ZV9pZCIpICU+JSANCiAgZmlsdGVyKGRhdGUgPiAyMDE4MDczMSYgYXJyaXZhbF90aW1lPCIwNjowMDowMCIpDQpgYGANCg0KIyMjIyBEYXRhIFNvdXJjZQ0KW1RFQ10oaHR0cDovL2d0ZnMub3ZhcGkubmwvdGVjL2d0ZnMtdGVjLWxhdGVzdC56aXApDQoNCiMjIyMgRGF0YSBEZXNjcmlwdGlvbg0KVHJhbnNpdCBkYXRhIHN1Y2ggYXMgc3RvcCBsb2NhdGlvbnMsIHNjaGVkdWxlZCB0aW1lcyBhbmQgZGF0ZXMsIGV0Yy4NCg0KIyMjIyBEYXRhIExpbWl0YXRpb24NClRoZSBkYXRhIHdhcyBjb2xsZWN0ZWQgNCB5ZWFycyBhZ28uIFRodXMgaXMgYSBsaXR0bGUgb3V0ZGF0ZWQgYW5kIHRoZSBkYXRhIGRvZXMgbm90IGNvbnRhaW4gdGhlIG1lYXN1cmUgcGFzc2VuZ2VyIGZsb3cuIEFsc28sIHNpbmNlIHRoZSBkYXRhc2V0IGlzIHRvbyBsYXJnZSwgSSBkZWNpZGVkIHRvIGxpbWl0IGl0IHRvIHRoZSBlYXJseSBob3VycyBvZiBlYWNoIGRheSBpbiBBdWd1c3QuDQoNCiMjIyMgVmFyaWFibGVzDQpgYGB7cn0NCmhlYWQod2FsbG9vblRyYW5zaXQpDQpgYGANCg0Kc3RvcF9sb24gYW5kIHN0b3BfbGF0IGluY2x1ZGluZyB0aGUgYXJyaXZhbF90aW1lIG9mIHRoZSB0cmlwcyBhcmUgaW50ZXJlc3RpbmcuIFRoZSBzdG9wX2xvbiBhbmQgc3RvcF9sYXQgYXJlIGJhc2ljYWxseSB0aGUgbG9udGl0dWRlIGFuZCBsYXRpdHVkZSBvZiBlYWNoIHN0b3AgYW5kIGFycml2YWxfdGltZSBpcyB0aGUgdGltZSB3aGVuIHRoZSB0cmFpbiBhcnJpdmFscyBhdCB0aGUgc3RvcC4NCg0KIyMjIyBWaXN1YWxpemF0aW9uDQpgYGB7ciwgZmlnLmFsdD0ibWFwIGZvciB0aGUgc3RvcHMifQ0KbWFwU3RvcHMgPC0gZ2V0X3N0YW1lbm1hcCgNCiAgYmJveCA9IGMobGVmdCA9IDIsIGJvdHRvbSA9IDQ5LjQsIHJpZ2h0ID0gNiwgdG9wID0gNTEpLCANCiAgbWFwdHlwZSA9ICJ0b25lciIsDQogIHpvb20gPSAxMikNCg0KZ2dtYXAobWFwU3RvcHMpICsNCiAgZ2VvbV9wb2ludCgNCiAgICBkYXRhID0gd2FsbG9vblRyYW5zaXQsDQogICAgYWVzKHggPSBzdG9wX2xvbiwgeSA9IHN0b3BfbGF0KSwNCiAgICBhbHBoYSA9IC4zLA0KICAgIHNpemUgPSAuMSwNCiAgICBjb2xvciA9ICJtYXJvb240Ig0KICApICsNCiAgbGFicyh0aXRsZSA9ICJTdG9wcyBJbiBXYWxsb24gQmVsZ2l1bSIpDQpgYGANCg0KIyMjIyBKb2luIFZhcmlhYmxlcw0KQWxsIHRoZSB2YXJpYWJsZXMgaW5jbHVkaW5nICJpZCIgd291bGQgYmUgcG9zc2libGUgdG8gam9pbiB3aXRoIG90aGVyIGRhdGFzZXQsIHN1Y2ggYXMgdHJpcF9pZCBvciByb3V0ZV9pZC4NCg0KIyMjIERhdGEgMg0KDQpgYGB7cn0NCmVuZXJneVVzZTwtIHJlYWRfZXhjZWwoIkVuZXJneSBVc2UueGxzIiwgc2hlZXQ9IkRhdGEiKSAlPiUgDQogIG5hX3JlcGxhY2UoMCkgJT4lIHBpdm90X2xvbmdlcigNCiAgICAgICAgICAgICAgICBjb2xzID0gYygiMTk2MCI6IjIwMjEiKSwNCiAgICAgICAgICAgICAgICBuYW1lc190byA9ICJ5ZWFyIiwNCiAgICAgICAgICAgICAgICB2YWx1ZXNfdG8gPSAiY291bnRyeV9lbmVyZ3lfdXNlZCINCikNCmBgYA0KDQojIyMjIERhdGEgU291cmNlDQpbT0VDRF0oaHR0cHM6Ly9kYXRhLndvcmxkYmFuay5vcmcvaW5kaWNhdG9yL0VHLlVTRS5QQ0FQLktHLk9FKQ0KDQojIyMjIERhdGEgRGVzY3JpcHRpb24NCkVuZXJneSB1c2UgKGtnIG9mIG9pbCBlcXVpdmFsZW50IHBlciBjYXBpdGEpIGJ5IGNvdW50cmllcyBzaW5jZSAxOTYwIHRocm91Z2ggMjAxNS4NCg0KIyMjIyBEYXRhIExpbWl0YXRpb24NClRoZSBkYXRhIHdhcyBjb2xsZWN0ZWQgNiB5ZWFycyBhZ28uIFRodXMgaXMgYSBsaXR0bGUgb3V0ZGF0ZWQgYW5kIHRoZSBkYXRhIG9ubHkgY29udGFpbnMgb25lIGtpbmQgb2YgZW5lcmd5IHRoYXQgaXMgdXNlZCBieSBjb3VudHJpZXMuIEluIGFkZGl0aW9uLCBzb21lIG9mIHRoZSBkYXRhIGlzIG1pc3NpbmcuDQoNCiMjIyMgVmFyaWFibGVzDQpgYGB7cn0NCmhlYWQoZW5lcmd5VXNlKQ0KYGBgDQoNCkNvdW50cnkgTmFtZSwgeWVhciBhbmQgY291bnRyeV9lbmVyZ3lfdXNlZCBhcmUgdGhlIHRocmVlIGludGVyZXN0aW5nIHZhcmlhYmxlcyBpbiB0aGlzIGRhdGFzZXQuIENvdW50cnkgTmFtZSByZWZlcnMgdG8gdGhlIG5hbWUgb2YgdGhlIGNvdW50cnksIHllYXIgcmVmZXJzIHRvIHllYXIgYW5kIGNvdW50cnlfZW5lcmd5X3VzZWQgcmVmZXJzIHRvIHRoZSB1c2Ugb2YgZW5lcmd5IChrZyBvZiBvaWwgZXF1aXZhbGVudCBwZXIgY2FwaXRhKSBieSBlYWNoIGNvdW50cnkuDQoNCiMjIyMgVmlzdWFsaXphaXRvbg0KYGBge3IsIGZpZy5hbHQ9IkVuZXJneSBDb21zdXB0aW9uIEJ5IENvdW50cmllcyJ9DQplbmVyZ3lVc2VHcmFwaDwtIGVuZXJneVVzZSAlPiUNCiAgZ2dwbG90KGFlcyh4ID0geWVhciwgeSA9IGNvdW50cnlfZW5lcmd5X3VzZWQpKSArDQogIGxhYnModGl0bGUgPSAiRW5lcmd5IENvbXN1cHRpb24gQnkgQ291bnRyaWVzIiwgeSA9ICJDb21zdXB0aW9uIChrZyBvZiBvaWwgZXF1aXZhbGVudCBwZXIgY2FwaXRhKSIsIHggPQ0KICAgICAgICAgTlVMTCkgKw0KICBnZW9tX3BvaW50KGFlcyhjb2xvciA9IGBDb3VudHJ5IE5hbWVgKSkgKw0KICB0aGVtZSgNCiAgICBwbG90LnRpdGxlID0gZWxlbWVudF90ZXh0IChoanVzdCA9IDAuNSwNCiAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICBmYWNlID0gImJvbGQiLA0KICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgIHNpemUgPSAxMSksDQogICAgYXhpcy50ZXh0LnggPSBlbGVtZW50X3RleHQoDQogICAgICBzaXplID0gNiwNCiAgICAgIGFuZ2xlID0gLTkwLA0KICAgICAgaGp1c3QgPSAwDQogICAgKSwNCiAgICBwYW5lbC5ncmlkLm1ham9yLnggID0gZWxlbWVudF9ibGFuaygpLA0KICAgIHBhbmVsLmdyaWQubWlub3IueCA9IGVsZW1lbnRfYmxhbmsoKSwNCiAgICBsZWdlbmQucG9zaXRpb24gPSAnbm9uZScNCiAgKQ0KDQpnZ3Bsb3RseShlbmVyZ3lVc2VHcmFwaCwgdG9vbHRpcCA9IGMoIkNvdW50cnkgTmFtZSIsImNvbG91ciIpICkNCmBgYA0KDQojIyMjIEpvaW4gVmFyaWFibGVzDQpDb3VudHJ5IE5hbWUsIENvdW50cnkgQ29kZSBhbmQgeWVhciBhcmUgcG9zc2libGUgdG8gam9pbiB3aXRoIG90aGVyIGRhdGFzZXQu